Make sure the HTTP call is never made inside a database transaction. #1099

Open
wants to merge 18 commits into
base: master

GeoffreyHuck
Contributor

@GeoffreyHuck GeoffreyHuck commented Jul 8, 2024

fixes #1081

Because we want transactions to be fast.

Now, we call the endpoint at the end of the current transaction.

Tricky part: How to pass the config endpoint to the DataStore

That was the trickiest part.

The first question is: does DataStore need to know how to schedule the propagation? I decided that the answer was yes, because it was easier: since we do it after a transaction, it makes sense for that code to be aware of transactions.

Also, InTransaction is called in many places, and it doesn't make sense at all to pass the endpoint as a parameter to this function, because most transactions are not about propagations at all.

What was done: the request Context was already passed to the DataStore, but it wasn't used. So the endpoint is now stored in the request Context, in a middleware, and retrieved from the DataStore. It's the easiest way I found.

Alternatives

  1. Make the configuration available globally. It's always tempting, but if we start like this we might end up with more and more globals.
  2. Have another method called InTransactionWithPropagation with the endpoint as parameter, that does the same as InTransaction but with a propagation. The downside is that we would have to know which one to use at the start of the transaction. It seems to me that the propagation is a notion that appears at a lower level than the service.

Make sure the propagation types are unique when we call the endpoint

We also make the propagation types, which are sent as a parameter when we call the propagation endpoint, unique. We don't have to, and it would work without it, but it seems cleaner.
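
For illustration only, one way to de-duplicate a slice of type names while keeping first-occurrence order. This is a sketch and not necessarily how the project's `utils.UniqueStrings` is written; the lower-case name `uniqueStrings` and the type names in `main` are hypothetical.

```go
package main

import "fmt"

// uniqueStrings returns the distinct values of the input, preserving the order
// in which each value first appears.
func uniqueStrings(values []string) []string {
	seen := make(map[string]struct{}, len(values))
	result := make([]string, 0, len(values))
	for _, value := range values {
		if _, ok := seen[value]; ok {
			continue // already emitted
		}
		seen[value] = struct{}{}
		result = append(result, value)
	}
	return result
}

func main() {
	// The type names are placeholders for whatever the endpoint accepts.
	fmt.Println(uniqueStrings([]string{"results", "permissions", "results"}))
}
```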

Review

Easier to review commit by commit. Details in the commit messages.

…because it calls an endpoint.

Signed-off-by: Geoffrey Huck <[email protected]>
…ntext in the DataStore.

We need the endpoint to call the async propagation. And it has to be from DataStore because we want to specifically do it after the current transaction.

Signed-off-by: Geoffrey Huck <[email protected]>
…t does.

Move it to the `database` package, because otherwise we get a cyclic dependency error when calling it from the `database` package (database -> service -> auth -> database). This might be solvable in another way, but it would require refactoring.

Signed-off-by: Geoffrey Huck <[email protected]>
…in the slice.

It's not a requirement, it would work without it, but it feels better. Also, the endpoint is called with the types, so it avoids redundancy there too.

Signed-off-by: Geoffrey Huck <[email protected]>
Signed-off-by: Geoffrey Huck <[email protected]>
…enAndRunListeners`. It makes more sense because the work that needs a propagation is done there already.

Signed-off-by: Geoffrey Huck <[email protected]>
…guration in the `request Context`.

@see #1099

Signed-off-by: Geoffrey Huck <[email protected]>
@GeoffreyHuck GeoffreyHuck changed the title from "WIP: Make sure the HTTP call is never made inside a database transaction." to "Make sure the HTTP call is never made inside a database transaction." Jul 8, 2024

codecov bot commented Jul 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (4d1012c) to head (dd140bf).
Report is 22 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##            master     #1099   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          239       239           
  Lines        14462     14480   +18     
=========================================
+ Hits         14462     14480   +18     


@GeoffreyHuck GeoffreyHuck requested a review from smadbe July 8, 2024 12:52
Signed-off-by: Geoffrey Huck <[email protected]>
…chedule it asynchronously. Because we expect the database to be propagated after the command ends.

Signed-off-by: Geoffrey Huck <[email protected]>
…s is intended because in this situation, it shouldn't have a value.

Signed-off-by: Geoffrey Huck <[email protected]>
… propagation.

When no endpoint is defined, as in the tests, it was run afterwards, because in that case the propagation runs synchronously.
When that happens, some tests don't pass.

Signed-off-by: Geoffrey Huck <[email protected]>
…l propagations when one is done, there is no distinction between with and without results propagation anymore.

@see #1100

Signed-off-by: Geoffrey Huck <[email protected]>
…l propagations when one is done, the permissions propagation is always done (because the results propagation is always done).

@see #1100

Signed-off-by: Geoffrey Huck <[email protected]>
@smadbe
Contributor

smadbe commented Jul 9, 2024

I would prefer that @zenovich reviews this one.

@smadbe smadbe requested review from zenovich and removed request for smadbe July 9, 2024 15:33
@zenovich
Collaborator

The first question is: does DataStore need to know how to schedule the propagation? I decided that the answer was yes, because it was easier: since we do it after a transaction, it makes sense for that code to be aware of transactions.

I still believe that the answer is "no", as was discussed on Slack a few days earlier. The reason is that DataStore is just a clever DB interface decorator. But perhaps we could rethink the meaning of DataStore.

Actually, I need to study this a bit more from scratch. From my perspective (based on how it looked two years ago), mutating operations on the DB left the DB in an inconsistent state (because of the operations themselves and also because of MySQL triggers), and the propagations fixed that by bringing the DB back into a consistent state. That was the reason for calling the propagations in the same transaction as the mutating operations.

So, first I would like to study the current situation.

@smadbe, is it an urgent task?

@zenovich
Collaborator

zenovich commented Jul 15, 2024

The main thing:
If we schedule a propagation outside of a transaction, we cannot guarantee that data modifications already committed to the DB will be followed by a corresponding scheduled propagation. For instance, consider a situation where an AWS Lambda function processing a user request gets killed because of a timeout right after it has committed the data modifications to the DB, but before it has called another AWS Lambda function to trigger the propagation.

Such situations are really dangerous and hard to debug. They will definitely lead to production failures and DB inconsistency issues.

I would think 100500 times before doing this, and in the end I would find another approach.
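
The window described above can be pictured with a small simulation. It is only an illustration under stated assumptions: `dbState` and `handleRequest` are hypothetical names, and the `killedAfterCommit` flag stands in for the process being terminated between the commit and the call that triggers the propagation.

```go
package main

import "fmt"

// dbState is a hypothetical stand-in for what is observable after the request.
type dbState struct {
	dataCommitted        bool
	propagationScheduled bool
}

// handleRequest simulates a handler that commits first and schedules the
// propagation as a separate, later step.
func handleRequest(db *dbState, killedAfterCommit bool) {
	db.dataCommitted = true // step 1: the transaction commits
	if killedAfterCommit {
		return // the process is gone; step 2 never runs
	}
	db.propagationScheduled = true // step 2: trigger the propagation
}

func main() {
	healthy, killed := &dbState{}, &dbState{}
	handleRequest(healthy, false)
	handleRequest(killed, true)
	fmt.Printf("healthy: %+v\n", *healthy)
	fmt.Printf("killed:  %+v\n", *killed)
}
```

In the second case the data is committed but nothing records that a propagation is still owed, which is the inconsistency the comment warns about.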

s.mustBeInTransaction()

triggersToRun := s.DB.ctx.Value(triggersContextKey).(*awaitingTriggers)
triggersToRun.SchedulePropagationTypes = utils.UniqueStrings(append(triggersToRun.SchedulePropagationTypes, types...))
Collaborator

@zenovich zenovich Jul 15, 2024

If the order of types doesn't matter, it would be much cleaner to store map[string]struct{} in triggersToRun.SchedulePropagationTypes instead of creating utils.UniqueStrings() and tests for it. Also, the map works faster. But a struct containing all the possible values as boolean fields (like 'awaitingTriggers') would be even better than the map: it would store only several booleans, while the map stores strings. The struct also provides immediate access to its fields, while the map uses hashing. Finally, with the struct it is not possible to mistype a value, which is a common issue with maps and slices.

If the order of types matters, UniqueStrings() breaks it anyway since maps don't preserve the order of keys.
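
A sketch of the struct-of-booleans suggestion. The type name `propagationTypes`, its fields, and the string names it emits are all hypothetical; the real set of propagation types is not shown in this thread.

```go
package main

import "fmt"

// propagationTypes is a hypothetical set with one boolean per propagation
// kind. A misspelled field name is a compile error, unlike a misspelled string.
type propagationTypes struct {
	Permissions bool
	Results     bool
}

// merge returns the union of two sets; duplicates cannot occur by construction.
func (p propagationTypes) merge(other propagationTypes) propagationTypes {
	return propagationTypes{
		Permissions: p.Permissions || other.Permissions,
		Results:     p.Results || other.Results,
	}
}

// names lists the selected kinds in a fixed order, e.g. to build the request
// sent to the endpoint.
func (p propagationTypes) names() []string {
	names := make([]string, 0, 2)
	if p.Permissions {
		names = append(names, "permissions")
	}
	if p.Results {
		names = append(names, "results")
	}
	return names
}

func main() {
	pending := propagationTypes{Results: true}
	pending = pending.merge(propagationTypes{Results: true, Permissions: true})
	fmt.Println(pending.names())
}
```

With this shape no de-duplication helper is needed, and the output order is fixed by `names` rather than by insertion order.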

@@ -262,10 +279,6 @@ func (s *DataStore) WithForeignKeyChecksDisabled(blockFunc func(*DataStore) erro
})
}

func (s *DataStore) IsInTransaction() bool {
Collaborator

It was a nice useful method.


return nil
})
mustNotBeError(err)
Collaborator

@zenovich zenovich Jul 15, 2024

Public methods must not report errors via panic; the panic/recover pattern should be used only within a package (see https://go.dev/doc/effective_go#recover). This is also an "architecture decision" made in March 2019, which we have followed ever since.
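
A hedged sketch of the pattern Effective Go describes: internal helpers may panic, but the exported function recovers and returns an ordinary error. `internalError`, `recoverInternal`, and `Propagate` are hypothetical names; only `mustNotBeError` echoes a name visible in the diff, and its body here is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// internalError wraps errors raised by this package's own helpers, so that
// unrelated panics (for example nil dereferences) are not swallowed.
type internalError struct{ err error }

// mustNotBeError panics with a package-private wrapper on a non-nil error.
func mustNotBeError(err error) {
	if err != nil {
		panic(internalError{err})
	}
}

// recoverInternal must be deferred directly. It turns this package's own
// panics into a returned error and re-panics on anything else.
func recoverInternal(returnErr *error) {
	switch r := recover().(type) {
	case nil:
		// no panic happened
	case internalError:
		*returnErr = r.err
	default:
		panic(r)
	}
}

// doWork is an internal step that reports failure via mustNotBeError.
func doWork(fail bool) {
	if fail {
		mustNotBeError(errors.New("propagation failed"))
	}
}

// Propagate is the exported entry point: callers only ever see an error value.
func Propagate(fail bool) (err error) {
	defer recoverInternal(&err)
	doWork(fail)
	return nil
}

func main() {
	fmt.Println(Propagate(false))
	fmt.Println(Propagate(true))
}
```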

// If endpoint is an empty string, it will be done synchronously.
func SchedulePropagation(store *database.DataStore, endpoint string, types []string) {
func StartAsyncPropagation(store *DataStore, endpoint string, types []string) {
Collaborator

I would move this method into app/service. It can be called directly from the endpoint handler when async propagation is chosen. The database package is for database-related things only.

@@ -1,6 +1,8 @@
package utils
Collaborator

@zenovich zenovich Jul 15, 2024

A package should not be called 'utils', as all packages should have meaningful names. 'utils' is a classic example of a bad package name. We should do something about it (see https://go.dev/blog/package-names).

Collaborator

@zenovich zenovich left a comment

All the above

Make all propagations async
Successfully merging this pull request may close these issues.

Make sure the call of the propagation endpoint is not inside a database transaction
3 participants